
528 research outputs found

What Level of Quality can Neural Machine Translation Attain on Literary Text?

Authors: Toral Antonio, Way Andy
Publication date: 01/01/2018

Given the rise of a new approach to MT, Neural MT (NMT), and its promising performance on different text types, we assess the translation quality it can attain on what is perceived to be the greatest challenge for MT: literary text. Specifically, we target novels, arguably the most popular type of literary text. We build a literary-adapted NMT system for the English-to-Catalan translation direction and evaluate it against a system pertaining to the previous dominant paradigm in MT: statistical phrase-based MT (PBSMT). To this end, for the first time we train MT systems, both NMT and PBSMT, on large amounts of literary text (over 100 million words) and evaluate them on a set of twelve widely known novels spanning from the 1920s to the present day. According to the BLEU automatic evaluation metric, NMT is significantly better than PBSMT (p < 0.01) on all the novels considered. Overall, NMT results in an 11% relative improvement (3 points absolute) over PBSMT. A complementary human evaluation on three of the books shows that between 17% and 34% of the translations, depending on the book, produced by NMT (versus 8% and 20% with PBSMT) are perceived by native speakers of the target language to be of equivalent quality to translations produced by a professional human translator. Comment: Chapter for the forthcoming book "Translation Quality Assessment: From Principles to Practice" (Springer).

arXiv.org e-Print Archive

Crossref

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

Irish Universities

DCU Online Research Access Service

Dissertations of the University of Groningen

Post-editese: an Exacerbated Translationese

Author: Toral Antonio
Publication date: 01/08/2019

ARTS repository - University of Groningen

An Italian to Catalan RBMT system reusing data from existing language pairs

Authors: Ginestí-Rosell Mireia, Toral Antonio, Tyers Francis
Publication date: 01/01/2011

This paper presents an Italian→Catalan RBMT system automatically built by combining the linguistic data of the existing pairs Spanish–Catalan and Spanish–Italian. A lightweight manual postprocessing is carried out in order to fix inconsistencies in the automatically derived dictionaries and to add very frequent words that are missing according to a corpus analysis. The system is evaluated on the KDE4 corpus and outperforms Google Translate by approximately ten absolute points in terms of both TER and GTM.

DCU Online Research Access Service

Reassessing Claims of Human Parity and Super-Human Performance in Machine Translation at WMT 2019

Author: Toral Antonio
Publication venue: European Association for Machine Translation
Publication date: 01/01/2020

We reassess the claims of human parity and super-human performance made at the news shared task of WMT 2019 for three translation directions: English-to-German, English-to-Russian and German-to-English. First we identify three potential issues in the human evaluation of that shared task: (i) the limited amount of intersentential context available, (ii) the limited translation proficiency of the evaluators and (iii) the use of a reference translation. We then conduct a modified evaluation taking these issues into account. Our results indicate that all the claims of human parity and super-human performance made at WMT 2019 should be refuted, except the claim of human parity for English-to-German. Based on our findings, we put forward a set of recommendations and open questions for future assessments of human parity in machine translation. Comment: Accepted at the 22nd Annual Conference of the European Association for Machine Translation (EAMT 2020).

arXiv.org e-Print Archive

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

Dissertations of the University of Groningen

Topic modeling-based domain adaptation for system combination

Authors: Okita Tsuyoshi, Toral Antonio, van Genabith Josef
Publication date: 09/12/2012

This paper gives the system description of the domain adaptation team of Dublin City University for our participation in the system combination task in the Second Workshop on Applying Machine Learning Techniques to Optimise the Division of Labour in Hybrid MT (ML4HMT-12). We used the results of unsupervised document classification as meta information to the system combination module. For the Spanish-English data, our strategy achieved 26.33 BLEU points, a 0.33 BLEU point absolute improvement over the standard confusion-network-based system combination. This was the best score in terms of BLEU among six participants in ML4HMT-12.

CiteSeerX

Irish Universities

DCU Online Research Access Service